air hockey
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Chuck, Caleb, Feng, Fan, Qi, Carl, Shi, Chang, Agarwal, Siddhant, Zhang, Amy, Niekum, Scott
Hindsight relabeling is a powerful tool for overcoming sparsity in goal-conditioned reinforcement learning (GCRL), especially in certain domains such as navigation and locomotion. However, hindsight relabeling can struggle in object-centric domains. For example, suppose that the goal space consists of a robotic arm pushing a particular target block to a goal location. In this case, hindsight relabeling will assign high rewards even to trajectories that never interact with the block. However, these behaviors are only useful when the object is already at the goal -- an extremely rare case in practice. A dataset dominated by these kinds of trajectories can complicate learning and lead to failures. In object-centric domains, one key intuition is that meaningful trajectories are often characterized by object-object interactions such as pushing the block with the gripper. To leverage this intuition, we introduce Hindsight Relabeling using Interactions (HInt), which combines interactions with hindsight relabeling to improve the sample efficiency of downstream RL. However, because interactions do not have a consensus statistical definition that is tractable for downstream GCRL, we propose a definition of interactions based on the concept of a null counterfactual: a cause object is interacting with a target object if, in a world where the cause object did not exist, the target object would have different transition dynamics. We leverage this definition in Null Counterfactual Interaction Inference (NCII), which infers interactions by applying a "nulling" operation to a learned model. NCII achieves significantly improved interaction inference accuracy in both simple linear dynamics domains and dynamic robotic domains in Robosuite, Robot Air Hockey, and Franka Kitchen, and HInt improves downstream sample efficiency by up to 4x.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
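As a rough illustration of the null-counterfactual test described in the HInt/NCII abstract above, the sketch below masks out ("nulls") a candidate cause object in a learned forward model and flags an interaction when the prediction for the target object changes. The model class, masking scheme, and threshold are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MaskedForwardModel(nn.Module):
    """Predicts the target object's next state from all object states;
    a per-object binary mask lets us "null" (zero out) a candidate cause."""
    def __init__(self, num_objects: int, state_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_objects * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, obj_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # obj_states: (batch, num_objects, state_dim); mask: (batch, num_objects)
        nulled = obj_states * mask.unsqueeze(-1)
        return self.net(nulled.flatten(start_dim=1))

def infer_interaction(model, obj_states, cause_idx, threshold=0.1):
    """Flag an interaction if nulling the cause object changes the predicted
    next state of the target by more than `threshold` (illustrative value)."""
    batch, num_objects, _ = obj_states.shape
    full_mask = torch.ones(batch, num_objects)
    null_mask = full_mask.clone()
    null_mask[:, cause_idx] = 0.0              # the "nulling" operation
    with torch.no_grad():
        delta = model(obj_states, full_mask) - model(obj_states, null_mask)
    return delta.norm(dim=-1) > threshold      # per-sample interaction labels
```

Transitions flagged this way (e.g., gripper-block contacts) would then be the ones retained for hindsight relabeling, in the spirit of HInt.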
Dynamics as Prompts: In-Context Learning for Sim-to-Real System Identifications
Zhang, Xilun, Liu, Shiqi, Huang, Peide, Han, William Jongwon, Lyu, Yiqi, Xu, Mengdi, Zhao, Ding
Sim-to-real transfer remains a significant challenge in robotics due to the discrepancies between simulated and real-world dynamics. Traditional methods like Domain Randomization often fail to capture fine-grained dynamics, limiting their effectiveness for precise control tasks. In this work, we propose a novel approach that dynamically adjusts simulation environment parameters online using in-context learning. By leveraging past interaction histories as context, our method adapts the simulation environment dynamics to real-world dynamics without requiring gradient updates, resulting in faster and more accurate alignment between simulated and real-world performance. We validate our approach across two tasks: object scooping and table air hockey. In the sim-to-sim evaluations, our method significantly outperforms the baselines on environment parameter estimation by 80% and 42% in the object scooping and table air hockey setups, respectively. Furthermore, our method achieves at least a 70% success rate in sim-to-real transfer on object scooping across three different objects. By incorporating historical interaction data, our approach delivers efficient and smooth system identification, advancing the deployment of robots in dynamic real-world scenarios. Demos are available on our project page: https://sim2real-capture.github.io/
- Leisure & Entertainment > Games > Computer Games (0.77)
- Leisure & Entertainment > Sports > Hockey (0.47)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
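A minimal sketch of the in-context idea in the abstract above, assuming a transformer encoder trained across randomized simulations: at deployment, a frozen model reads a history of real transitions as a prompt and outputs dynamics-parameter estimates with no gradient updates. All class names, dimensions, and interfaces here are hypothetical.

```python
import torch
import torch.nn as nn

class InContextSysID(nn.Module):
    """Maps a history of (obs, action, next_obs) transitions to estimated
    simulator parameters (e.g., friction, damping)."""
    def __init__(self, obs_dim: int, act_dim: int, param_dim: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(2 * obs_dim + act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, param_dim)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, history_len, obs + act + next_obs) transition tuples
        tokens = self.encoder(self.embed(context))
        # Pool over the history and predict the environment parameters.
        return self.head(tokens.mean(dim=1))

# At deployment the frozen model "reads" real interactions as a prompt:
#   params_hat = model(real_transition_history)   # no gradient updates
#   sim.set_parameters(params_hat)                # align simulator to reality
```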
Energy-based Contact Planning under Uncertainty for Robot Air Hockey
Jankowski, Julius, Marić, Ante, Liu, Puze, Tateo, Davide, Peters, Jan, Calinon, Sylvain
Planning robot contact often requires reasoning over a horizon to anticipate outcomes, making such planning problems computationally expensive. In this letter, we propose a learning framework for efficient contact planning in real time subject to uncertain contact dynamics. We implement our approach for the example task of robot air hockey. Based on a learned stochastic model of puck dynamics, we formulate contact planning for shooting actions as a stochastic optimal control problem with a chance constraint on hitting the goal. To achieve online re-planning capabilities, we propose to train an energy-based model to generate optimal shooting plans in real time. The performance of the trained policy is validated both in simulation and on a real-robot setup. Furthermore, our approach was tested in a competitive setting as part of the NeurIPS 2023 Robot Air Hockey Challenge.
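To make the chance constraint concrete, here is a toy Monte Carlo version of the feasibility check described above: a candidate shooting plan is acceptable only if the estimated probability of scoring under the learned stochastic puck model is at least 1 - delta. The function interfaces are placeholders; the paper trains an energy-based model precisely so that such plans can be generated in real time rather than searched online.

```python
def goal_hit_probability(sample_rollout, plan, num_samples=256):
    """Estimate P(goal | plan) by sampling trajectories from a learned
    stochastic puck-dynamics model; `sample_rollout(plan)` is assumed to
    return True when the sampled trajectory ends with the puck in the goal."""
    return sum(sample_rollout(plan) for _ in range(num_samples)) / num_samples

def select_shot(candidate_plans, sample_rollout, cost, delta=0.1):
    """Pick the lowest-cost plan whose chance constraint P(goal) >= 1 - delta holds."""
    feasible = [p for p in candidate_plans
                if goal_hit_probability(sample_rollout, p) >= 1.0 - delta]
    return min(feasible, key=cost) if feasible else None  # None -> re-plan
```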
Learning to Play Air Hockey with Model-Based Deep Reinforcement Learning
In the context of addressing the Robot Air Hockey Challenge 2023, we investigate the applicability of model-based deep reinforcement learning to acquire a policy capable of autonomously playing air hockey. Our agents learn solely from sparse rewards while incorporating self-play to iteratively refine their behaviour over time. The robotic manipulator is controlled through continuous high-level actions for position-based control in the Cartesian plane, while the agent has only partial observability of an environment with stochastic transitions. We demonstrate that agents are prone to overfitting when trained solely against a single playstyle, highlighting the importance of self-play for generalization to novel strategies of unseen opponents. Furthermore, the impact of the imagination horizon is explored in the competitive setting of the highly dynamic game of air hockey, with longer horizons resulting in more stable learning and better overall performance.
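The self-play scheme in the abstract above can be summarized by a loop like the following, in which the learner periodically snapshots its own policy into an opponent pool so it is never trained against a single playstyle. The environment and agent interfaces are hypothetical stand-ins for the model-based learner (e.g., one rolling out an imagination horizon) described in the paper.

```python
import copy
import random

def self_play_training(env, agent, iterations=100, episodes_per_iter=50,
                       snapshot_every=10):
    opponent_pool = [copy.deepcopy(agent.policy)]    # start against an early snapshot
    for it in range(iterations):
        opponent = random.choice(opponent_pool)      # sample a past playstyle
        for _ in range(episodes_per_iter):
            obs = env.reset(opponent=opponent)
            done = False
            while not done:
                action = agent.act(obs)              # continuous Cartesian target
                next_obs, reward, done, info = env.step(action)  # sparse goal reward
                agent.observe(obs, action, reward, next_obs, done)
                obs = next_obs
            agent.update()   # e.g., a model-based update with imagination horizon H
        if (it + 1) % snapshot_every == 0:
            opponent_pool.append(copy.deepcopy(agent.policy))
    return agent
```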
Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning
Chuck, Caleb, Qi, Carl, Munje, Michael J., Li, Shuozhe, Rudolph, Max, Shi, Chang, Agarwal, Siddhant, Sikchi, Harshit, Peri, Abhinav, Dayal, Sarthak, Kuo, Evan, Mehta, Kavan, Wang, Anthony, Stone, Peter, Zhang, Amy, Niekum, Scott
Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy ones like reaching to challenging ones like pushing a block by hitting it with a puck, as well as goal-based and human-interactive tasks, our testbed allows a varied assessment of RL capabilities. The robot air hockey testbed also supports sim-to-real transfer with three domains: two simulators of increasing fidelity and a real robot system. Using demonstration data gathered through two teleoperation systems (a virtualized control environment and human shadowing), we assess the testbed with behavior cloning, offline RL, and RL from scratch.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Leisure & Entertainment > Sports > Hockey (1.00)
- Leisure & Entertainment > Games (0.68)
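As one concrete instance of the baselines the testbed is assessed with, the snippet below sketches a behavior-cloning fit on (observation, action) pairs from teleoperation demonstrations. Network sizes, batch size, and the tensor-based dataset format are assumptions for illustration, not the testbed's actual training code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def behavior_cloning(observations, actions, epochs=20, lr=1e-3):
    """Fit a feedforward policy to demonstration (obs, action) pairs."""
    obs_dim, act_dim = observations.shape[1], actions.shape[1]
    policy = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                           nn.Linear(256, 256), nn.ReLU(),
                           nn.Linear(256, act_dim))
    loader = DataLoader(TensorDataset(observations, actions),
                        batch_size=256, shuffle=True)
    optim = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for obs, act in loader:
            loss = nn.functional.mse_loss(policy(obs), act)  # regress demo actions
            optim.zero_grad()
            loss.backward()
            optim.step()
    return policy
```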
Fast Kinodynamic Planning on the Constraint Manifold with Deep Neural Networks
Kicki, Piotr, Liu, Puze, Tateo, Davide, Bou-Ammar, Haitham, Walas, Krzysztof, Skrzypczyński, Piotr, Peters, Jan
Motion planning is a mature area of research in robotics, with many well-established methods based on optimization or sampling of the state space that are suitable for kinematic motion planning. However, when dynamic motions under constraints are needed and computation time is limited, fast kinodynamic planning on the constraint manifold is indispensable. In recent years, learning-based solutions have become alternatives to classical approaches, but they still lack comprehensive handling of complex constraints, such as planning on a lower-dimensional manifold of the task space while considering the robot's dynamics. This paper introduces a novel learning-to-plan framework that exploits the concept of the constraint manifold, including dynamics, and neural planning methods. Our approach generates plans satisfying an arbitrary set of constraints and computes them in a short constant time, namely the inference time of a neural network. This allows the robot to plan and replan reactively, making our approach suitable for dynamic environments. We validate our approach on two simulated tasks and in a demanding real-world scenario, where we use a Kuka LBR Iiwa 14 robotic arm to perform the hitting movement in robotic Air Hockey.
- Europe > Poland > Greater Poland Province > Poznań (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
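A rough sketch of the learning-to-plan idea in the abstract above: a network maps the task specification (e.g., puck position and desired hitting direction) to trajectory parameters such as spline control points in a single constant-time forward pass, while training penalizes violations of the manifold and dynamics constraints. Shapes, loss terms, and weights are illustrative, not the paper's formulation.

```python
import torch
import torch.nn as nn

class NeuralPlanner(nn.Module):
    """One forward pass -> control points of a joint-space trajectory."""
    def __init__(self, task_dim: int, num_ctrl_pts: int, dof: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(task_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_ctrl_pts * dof),
        )
        self.num_ctrl_pts, self.dof = num_ctrl_pts, dof

    def forward(self, task: torch.Tensor) -> torch.Tensor:
        # Constant inference time regardless of plan difficulty.
        return self.net(task).view(-1, self.num_ctrl_pts, self.dof)

def planning_loss(ctrl_pts, manifold_violation, limit_violation, effort,
                  w_manifold=100.0, w_limits=10.0):
    """Penalize leaving the task-space manifold (e.g., the table plane) and
    exceeding joint velocity/torque limits, plus an effort/task objective."""
    return (w_manifold * manifold_violation(ctrl_pts).mean()
            + w_limits * limit_violation(ctrl_pts).mean()
            + effort(ctrl_pts).mean())
```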
The key to smarter robot collaborators may be more simplicity
Think of all the subconscious processes you perform while you're driving. As you take in information about the surrounding vehicles, you're anticipating how they might move and thinking on the fly about how you'd respond to those maneuvers. You may even be thinking about how you might influence the other drivers based on what they think you might do. If robots are to integrate seamlessly into our world, they'll have to do the same. Now researchers from Stanford University and Virginia Tech have proposed a new technique to help robots perform this kind of behavioral modeling, which they will present at the annual international Conference on Robot Learning next week.
- North America > United States > Virginia (0.25)
- North America > Canada > Ontario > Toronto (0.16)